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Abstract. This paper addresses the issues of conservativeness and computational complexity of prob- 
abilistic robustness analysis. We solve both issues by defining a new sampling strategy and robustness 
' measure. The new measure is shown to be much less conservative than the existing one. The new sampling 

C ^ 3 strategy enables the definition of eflicient hierarchical sample reuse algorithms that reduce significantly 

the computational complexity and make it independent of the dimension of the uncertainty space. More- 
over, we show that there exists a one to one correspondence between the new and the existing robustness 
measures and provide a computationally simple algorithm to derive one from the other. 

in 

Ph . 1. Introduction 

< 

Robustness analysis is used to predict if a system will perform satisfaetorily in the presence of uncer- 
' tainties. It is generally accepted as an essential step in the design of high-performance control systems. 

. In practice, the analysis has to be very efBcient because it has to use models as realistic as possible and, 

usually, it takes many cycles of analysis-design to come up with a satisfactory controller. The outcome of 
^ , the robustness analysis should allow the designer not only to evaluate the robust performance of a con- 

' troUer, but also to compare various controllers in order to obtain the best control strategy. Needless to 

(N . 

say, unnecessary conservativeness prevents a realistic analysis. 
■ Aimed at overcoming the computational complexity and conservatism of the classical deterministic worst- 

— I , cast approach, there are growing interests in developing probabilistic methods and randomized algorithms 

I (see, [II]-[Ii] and the references therein). Specially, a probabilistic robustness measure, referred to 

as the confidence degradation function or robustness function is proposed in [J- Such robustness measure 
has been demonstrated to be much superior than the classical deterministic robustness margin in terms of 
conservatism, computational complexity and generality of application. 

The computation of the robustness function using Monte Carlo simulations requires uniform sampling 
I from bounding sets in the uncertainty space, which can reach high dimensions very quickly; for example if 

the uncertainty is modelled by a 5 x 5 complex-valued matrix then the dimension of the uncertainty space 
is 50. We will show here that such sampling suffers from what we term surface effect and may introduce 
undue conservativeness in the evaluation of system robustness. We address this conservativeness with a 
new sampling technique and a new probabilistic robustness measure that is significantly less conservative. 
Moreover, with a suitable computing structure it can be evaluated for arbitrarily dense gridding of un- 
certainty radius with a computational complexity that is very low and is independent of the dimension of 
uncertainty. 

We shall use the following notation throughout this paper. The uncertainty is denoted as boldface A and 
its realization is denoted as A. The probability density function of A is denoted as f^. We measure the 
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size of uncertainty by a function ||.|| which has the scalable property that ||p^|| = p||^|| for any uncertainty 
instance A and any p > 0. Obviously, the most frequently used i?oo or Ip norm of uncertainty possesses 
such scalable property. The uncertainty bounding set of radius r is denoted a,s Br — {A : \\A\\ < r}. We 
use dBr to denote {Z\ : ||Z\|| = r}. Specially, B denotes {Z\ : ||Z\|| < 1} and dB denotes {zi : ||Z\|| = 1}. 
For a subset Sr of dBr, its "area" is defined as 



where "/" denotes the multivariate Lebesgue integration and the down arrow "|" means "decreases to" . 

The indicator function I(.) means that I(^) = 1 if the robustness requirement is guaranteed for A and 
I{A) ~ otherwise. The probability of an event is denoted as Pr{.}. The conditional probability is denoted 
as Pr{. I .}. The set of complex number is denoted as C. The set of real matrices of size m x p is denoted 
as R™^^'. The set of complex matrices of size m x p is denoted as C™^^. The real and complex parts of 
a number is denoted as 3?(.) and respectively. The largest and the second largest singular values of 
a matrix are denoted as and <T2{-) respectively. The ceiling function is denoted as [.] and the floor 
function is denoted as [.J . 

1.1. The Surface Effect of Uniform Sampling. In order to illustrate the surface effect, consider a 
uniform sampling extracting samples from the uncertainty set Br- Let Ep denote the event that a sample 
chosen uniformly from Br lies outside the bounding set Bp of radius p < r. Under the assumption of 
uniform distribution it is easy to see that such event will have the probability Pr{_Ep} = 1 — (^)'' where 
d is the dimension of uncertainty. As d increases this probability approaches one for all p < r. For 
example when ^ = 0.9 and d = 50 then Pr{£'p} = 0.9948. Hence out of 1000 samples extracted 
uniformly from the bounding set of radius r one would expect that about 995 will be outside 
the bounding set with radius p = 0.9r. If the uncertainty is well modeled one can reasonably assume 
that large uncertainties are less likely than small ones and we are faced with the fact that the uniform 
sampling selects cases that are not indicative of the actual situation but present a very unfavorable picture. 
In Section 2 we discuss in detail the modeling of uncertainties and show that uniform sampling can give a 
very conservative evaluation of system robustness. In Section 3 we introduce a new sampling technique and 
a new robustness measure which overcomes the conservativeness issue. Section 4 establishes a one to one 
mapping between our measure and the existing one and considers other capabilities of the new robustness 
function. The detail algorithms are presented in Section 5. Section 6 addresses the issue of computational 
complexity for the evaluation of robustness function. In particular we show that by using a special type of 
hierarchial data structure it is possible to design computational algorithms that have a complexity that is 
independent of the dimension of the uncertainty. The proofs of theorems are given in the Appendix. 



In this section, we shall discuss the characteristics of uncertainty from the perspective of modelling 
practices. 

Consider an uncertain system shown in Figure [TJ In control engineering, one usually takes into account 
all possible directional information about the uncertainty by introducing weighting matrices and absorbing 
it into the generalized plant P. Therefore, it is reasonable to assume that the uncertainty A is radially 
symmetrical in distribution in the sense that, for any r > and any Sr Q {A : \ \A\ \ = r}, 
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Figure 1. Uncertain System 

if /||A||(-) is eontinuous at r. where the conditional probabihty in the left hand side is defined as 

j.^ Pr {A e {A GSp-. r - £i < p < r + £2}} 
siio Pi-ir - El < \\A\\ < r + 62} 

6210 L 1 — II M — 

On the other hand, one usually attempts to make the magnitude of modelling error, measured by \\A\\, 
as small as possible. Due to the effort to minimize \\A\\ in modelling, it is reasonable to assume that 
small modelling error is more likely than large modelling error. This gives rise to the rationale of treating 
||A|| as a random variable such that its density, /||a||(?') = '^^^'^^^^(^ ^^^''"^^ ' non-increasing with respect to 
r. In the sequel, we shall use ^ to denote the family of radially symmetrical and non-increasing density 
function /a- It should be noted that a wider class of probability density functions, denoted by has been 
proposed in [3] to model uncertainty. Such family consists of radially symmetrical density function 
that is non-increasing in the sense that f^{Ai) < /a(^2) if ||^i|| > ll^2||- It can be shown that is a 
superset of i.e., D ^ (see Lemma 2 in Appendix A). 

2.1. Existing Robustness Function. The existing robustness function, proposed in [3], is given by 

P(r) =^ inf F(p) 

pe{0,r] 

with 

P(r) Pr{I(^") = 1} 

where is uniformly distributed over Br- It has been shown in that P(r) is a lower bound of the 
probability of guaranteeing the robustness requirement if the density of uncertainty belongs to ^ and the 
uncertainty is bounded in Br- 

An attracting feature of the existing robustness function is that it relies on very mild assumptions about 
uncertainty. However, as can be seen from Theorem 6.1 (in page 856) of [3], the associated computational 
complexity can be very high for large uncertainty dimension. Another issue of the existing measure is that 
it can be very conservative from the perspective of modelling practices. For illustration of this point, we 
consider a conceptual example as follows. 

Suppose it is known that the norm of uncertainty A cannot exceed 7. Without loss of generality, assume 
7=1. That is, all instances of A are included in the bounding set B = {A : \\A\\ < 1}. We partition 
B as m layers Si = {A : r^_i < \\A\\ < ri}, £ = 1, 2, • • • , m by radii fi = ^ = 0, 1, • • • , m. From the 
consideration of modelling practices, it is reasonable to assume that the density of uncertainty A belongs 
to ^. Hence, for sufficiently large m, we have Pr{.^ e Si] > Pr{.^ e Si+i}, £ — 1,2, ■ ■ ■ ,m — 1. In 
reality, it is not impossible that not only the outer layers are "bad" and some inner layer is also "bad" . 
Such scenario is described as follows: 

The robustness requirement is violated for A €z Si and for A €z Se, £ — j, j + 1, ■ ■ ■ ,m where i and j are 
integers such that 2 < i + 1 < j < in. See Figure[2]for an illustration. Let d be the dimension of uncertainty 
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Figure 2. Conceptual Example (The robustness requirement is violated for red layers 
and is satisfied for green layers. Existing robustness measure tends to completely ignore 
uncertainty instances in the inner layers as d increases. Based on the existing robustness 
measure, a very thin bad layer may lead to an unrealistic judgement that the system has 
very poor robustness. However, the instances in the inner layers are more probably to 
occur in reality. Hence, they should have at least equal impact on the evaluation of system 
robustness as compared to the instances in the outer layer.) 



space. By direct computation, we obtain the existing robustness function as IP(r) = iiifpg[o.r] ^{p) where 



P(p) 



1, forp<r-i-i; 

T-Hd, for ri-1 < p < n; 

(3-l)''-i'' + (i-l)'^ 

( 7?l p ) 



for Ti < p < rj-i; 
for r-j-i < p < 1. 



Clearly, lim(i^ooP(r) = 1 for r < ri_i and limd^ooP{r) = for r^^i < r < 1. This indicates that the 
existing robustness function tends to be a discontinuous function as d increases. An undesirable feature 
of existing measure resulted from such discontinuity is that a very small variation in the knowledge of the 
uncertainty bound, 7, may lead to an opposite evaluation of the system robustness. 

For practical systems, large uncertainty instance is less probably while the robustness requirement is more 
likely to be violated for larger uncertainty instance. Consequently, unduly conservatism may be introduced 
if the uncertainty instances near the surface of uncertainty bounding sets assume a dominant role. This is 
indeed the case for the existing probabilistic robustness measure. This can be illustrated as follows. Suppose 
Pr{||.^|| < 7} = 1. For the existing measure, the corresponding density of 1 1 .^11 of the sampling distribution 

/ s d-l 

that determines P(7) is often times close to f\\A\\{r) = dl-p-j where p* = max{p : P(p) = IP(7)}- For 

p ~ p* , the probability that a sample falls into {A : p < \\A\\ < p*} is 1— (^-^^ which is very close to 
1 when the dimension d is high. This shows that the uncertainty instances near the surface of Bp* are 
dominating in the evaluation of system robustness. 



3. New Sampling Technique and Robustness Function 

We have shown before that uniform sampling in high dimensional sets suffers from a surface effect. 
In the following we introduce a new sampling technique that offsets such effect and we use the modified 
sampling technique to define the new robustness measure. 

3.0.1. A New Sampling Technique. To offset the surface effect for uncertainties with radial symmetry we 
define two independent random variables. One, U is uniformly distributed in the surface of the unit 
bounding set, {A : \ \A\ \ — 1}, in the uncertainty space. The second random variable is R which is a scalar 
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variable uniformly distributed over [0,r]. Clearly, for a given value of the scalar random variable R, the 
uncertainties lay on the surface of a ball and since R is scalar the surface effect is reduced. 

3.0.2. A New Robustness Function. Now that have established the sampling technique to be used, we define 
the robustness measure for the radius r as 

^(r) = inf ^(p) with ,^{r) = Pi{l{UR) = 1} 

pG(0, r] 

where U is a sample from U and R a sample from R. The probabilistic implication of such robustness 
measure can be seen from the following theorem. 

Theorem 1. For any robustness requirement, 

inf Pr{I(Zi) = 1\\\A\\< 7} -^(7) > lil)- 

See Appendix A for a proof. The intuition behind Theorem [T] is that, in the worst-case, the uncertainty 
instances in the inner layers should assume equal importance as that of uncertainty instances in the outer 
layers in the evaluation of system robustness. It should be noted that the density (.) can be unbounded 
and has infinitely many and arbitrarily distributed discontinuities. An example of unbounded density is 
/||/i||(p) = ^, k>l. 

Now we revisit the conceptual example discussed in Section 12.11 Our robustness function is ^(r) = 
infpe[o,r] ^{p) where 

1, forp<ri_i; 



— , for ri_i < p <ri\ 
IHPzlL for n < p < r ,_i: 
for r,_i < p < 1. 

- mp ' -f ^ — " 

As can be seen from Figure [3l our robustness measure is significantly less conservative than the existing 
one. 

4. Mapping of Robustness Functions 

In this section, we shall demonstrate that there exists a fundamental relationship between our robustness 
measure and the existing probabilistic robustness measure. This relationship can be exploited, for example, 
to reduce the computational complexity of existing probabilistic robustness measure. 

4.1. Integral Transforms. The following theorem shows that there exists an integral transform between 
our proposed robustness function and existing robustness function. 

Theorem 2. Define (j}{r) = Pr{I(rC/) = 1} where U is a random variable uniformly distributed over 
{A : \\A\\ = 1}. Suppose that the distribution of uncertainty A is radially symmetrical and that both 
f\\A\\{-) o,iT-d- 4>{-) are piece-wise continuous. Then, for any r > 0, 

^(r) = + / P rp dp, 

n n 

P(r) = n ^{r)-n{n~l) [ ^(rp) p""'^ dp 

Jo 

where n is the dimension of uncertainty space. 

See Appendix B for a proof. Theorem [5] shows that once one of ^{.) and P(.) is available from Monte 
Carlo simulation, the other can be obtained without simulation. 
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Figure 3. Comparison of Robustness Functions (m = 20, i = 11 and j = 19. The 
dimension of uncertainty space is d = 50, which is equivalent to a complex block of size 
5 X 5.) 



4.2. Recursive Computation. For a transform to be useful, we shall develop efficient method for its 
computation. The efficiency can be achieved by recursive computation. We first discuss the computation 
of transform from ^(.) to P(.). 

It can be seen that the expression of P(.) in terms of ^(.) is not amenable for recursive computation. By a 
change of variable, we rewrite the second equation of Theorem[3]as P(r) = n 3^{r)— "^"""'"^ /J^ ^{p) p"'~^dp. 
Clearly, the major computation is on the integration 7(r) = Jg ^{p) p"^^dp, which can be computed 
recursively because of the relationship I{r + h) = I{r) + J^^'^ 3^{p) p"^^dp. Unfortunately, there will be a 
numerical problem for computing the product x I{r) in the situation that n is large and r < 1. For 

example, "^"^"""^ can be a huge number and cause intolerable numerical error when n — 36 and r = 0.5. To 
overcome this problem, we derive the following recursive relationship 

Since ,^{.) can be approximated by a simple function, we can decompose "^^""^^2 J^^'^ £^{p) p"~^dp as a 
summation of integrations of the form -^(p) p"~^dp with = c, Vp G [a,/?]. Clearly, we have 

the explicit formula /f ^(p)p"-^dp = (n - l)c (^)" [(f)" - l] . 

In a similar manner, can be computed recursively by relationship 

n-1 1 r^^ , . , 

+ -r / np)dp. 

n r + h J^. 

5. Computational Algorithms and Hierarchial Sample Reuse 

In this section we shall discuss the evaluation of for uncertainty radius [f ,a] with sample size N 
and TO grid points j — fi < ■ ■ ■ < r„i — a. First, we shall introduce basic subroutines. Second, we present 



(r + h) 



^{r + h) 



(r) 



A STATISTICAL THEORY FOR THE ANALYSIS OF UNCERTAIN SYSTEMS 



7 



sample reuse algorithm based on sequential data merging method. Third, we shall demonstrate that the 
sequential sample reuse algorithm is impractical and propose hierarchy sample reuse algorithms. 

The basic idea of our algorithms is as follows. Let U'', k ^ 1, ■ ■ ■ , N he N i.i.d. samples uniformly 
generated from {Z\ : | |Z\| | = 1}. For i = 1, • • • , to, we can estimate 3^{ri) as ^''"^j^^ with Z\'''* U'^R'"'" 
where R'''^ is uniformly distributed over [0,ri] and is independent of f/'^ for fc = 1, • • • , TV. It should be 
noted that i?'^'', i = 1, • • • , to are not necessarily mutually independent to ensure that Z\'^'\ k = 1, - ■ ■ ,N 
are i.i.d samples. Due to the uniform distribution of i?*""'*, sample reuse techniques can be employed to save 
a substantial amount of computation for the generation of A^'^ and the evaluation of I(Z\'^-*) in the 
following manner. Let k be fixed. Let i? be a sample uniformly generated from interval [0,rp]. Then, for 
any index j such that rj G [R,rp], we can use R as R'^'^ , U^R as A'^'^ , and as I{A'''^). It can be 

shown that the minimum index j can be computed by explicit formula (|5.ip as 

(Afl-a)(m-] 



(5.1) 




a(A-l) 



for uniform gridding; 



(to, — 1) ^1 + "ETt)] ) ^'-'^ geometric gridding 



where "uniform gridding" means that — r^-i is the same for i = 2, • • • , m and "geometric gridding" 
means that is the same for i = 2, ■ ■ ■ , to. 

For a specific k, the sample [/'"' is referred to as a directional sample and the simulation with sample 
reuse techniques to obtain I(zi*''''), i ~ 1, - ■ ■ ,to is referred to as "Radial Sampling". Clearly, I(Z\'^'*), i = 
1, • • • , to can be expressed as a matrix Z? of 3 columns and random number of rows such that its i-th row 
[Dii, Di2, As] means that 

^ ifA3 = l; 
ifA3 = 

for Dii < j < Di2. The algorithm of "Radial Sampling" is formally described in Section [5TT1 

The process of obtaining the summation X^s^i T^{A'^'^), i = 1, • • • , to is accomplished by the subroutine 
"Merging" , which is described in Section 15.21 

5.1. Radial Sampling. For a directional sample U, the goal of radial sampling is to create a matrix 
D. The input of the subroutine "Radial Sampling" is U,X,a,m and the corresponding output is -D = 
RS(?7, A, a, to). The algorithm is presented as follows. 



I(Z\'^^^) 



• Let p ^ m and do the following. 

— Generate a sample R uniformly from [0, rp\. 

— Lei A^ UR and evaluate 1{A). 

— Determine the smallest index j such that rj > Rhy (|5.ip . 

— Let D ^ [j, p, I{A)] and s ^ 1{A). 

— Let p ^ j — 1. 

• While p > do the following. 

— Generate a sample R uniformly from [0, rp\. 

— Let zi ^ UR and evaluate 1{A). 

— Determine the smallest index j such that r > i? by (|5.ip . 

— If 1{A) 7^ s, add [j, p, I(/\)] to D as the first row and let s ^ ^{A). Otherwise, update the 
first element of the first row of D as j. 

— Let p <— j — 1. 

• Return D as the outcome of radial sampling. 
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5.2. Merging. The operation of merging involves two matrices D and H. Matrix D defines a segmented 
function over domain {1, • • • , to} in the sense that, for the j-th row of D, foii) = Dj^ for any i such 
that Dji < i < Dj2. Similarly, matrix H defines a segmented function over domain {1, • • ■ , to} in 
the sense that, for the j-th row of H, fnii) ~ Hj^ for any i such that Hji < i < Hj2- For input matrices 
D and H, the merging operation finds M = Merge(£', H) such that 

fAiii) = foii) + .fH{i), i = 1, • • • , m 

where fhii-) is a segmented function fAii-) over domain {1, • • • ,to} in the sense that, for the j-th row of 
Af , = Alj^ for any i such that Mji < i < Mj2- 

5.3. Sequential Sample Reuse Algorithm (SSRA). The sequential algorithm derives its name from 
the sequential nature of the data merging process. The input variable is N,X,a,m and the output is a 
matrix H of random number of rows and 3 columns. The main algorithm is presented as follows. 



• Let k ^ 1 and do the following. 

— Generate a directional sample U. 

— Perform radial sampling and let D <— RS(?7, A, a, to). 

— Let H ^ D. 

• While A: < do the following. 

— Generate a directional sample U . 

— Perform radial sampling and let D ^ RS(t/, A, a, to). 

— Perform merging and let H ^ Merge(£', H). 

— Let ^ fc + 1. 

• Return H . 



Once we have H from the execution of SSRA, we can estimate 3^{ri) as ^"^j^ = jy , i = 

1, - • • ,to. 

5.4. Hierarchy Sample Reuse Algorithm (HSRA). A major problem with the sequential algorithm 
is that the computational effort devoted to merging becomes an enormous burden as the sample size N 
becomes large. 

The merging time for N = 1000, 5,000, 10,000 and 50,000 are respectively 4, 120, 722 and 92119 
seconds, which is obtained by simulation on a PC of 1024M RAM and 3.2G CPU. As can be seen from 
Figure m the merging time required for N = 10^, 10^ and A = 5 x 10^ is predicted respectively as, 12 
days, 366 years, and 9 x 10^ years, by fitting the simulation data into a quadratic function (in log scale) 
based on regression techniques. For a better understanding of the complexity issue, a theoretical analysis 
of the computational complexity of data merging is as follows. 

From the merging process, it can be seen that the computational complexity of merging two matrices can 
be quantified by the sum of the numbers of the rows of the two input matrices. Thus, it suffices to study 
how the number of rows is growing when matrices D'^ ~ RS(?7'^, A, a, to), k — 1, • • ■ , N are sequentially 
merged. 

Note that the average numbers of rows for all D'^ are identical. Let this average be L. To merge 
with , the required computation is 2L. The computation to merge the outcome with is 3L. The 
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10^ lO'* lo'^ 10'' 10^ 

Sample Size 



Figure 4. Merging Time 




N= 8 



Figure 5. Illustration of Successive Binary Merging with 8. 

computation for all steps of merging forms a series, 2L, 3L, • • • , NL^ of constant increment L. Hence, the 
total number of computation is ^t^+^KJv-i) ^ ^his can be a huge number because N is usually large. 

To overcome the difficulty of sequential algorithm, we propose a merging method of hierarchy structure. 
We first introduce a subroutine called successive binary merging for N ~ 2^ data matrices as follows. 

Divide these N matrices ,■ ■ ■ , into ^ groups so that each group has two matrices. After merging 
each group, wc have raatrices. Repeating the operations of dividing and merging, we obtain a matrix 
in the final stage. This process can be associated with a binary tree as illustrated by Figure [5l 

For the general case that N is not a power of 2, we decompose TV as a summation of numbers which 
are powers of 2. For example, for N = 1000, we have iV = 512 + 256 + 128 + 64 + 32 + 8. Such 
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Figure 6. Merging with TV = 1000. 



decomposition corresponds to the decimal-to-binary conversion. In general, for N = J2'e=i with = 2^' 
and A^i < N2 < • • • < iV^, the merging can be performed as follows. 



• Let £ <— 1. Applying successive binary merging to A^i to create data matrix Mi. Let H ^ AIi. 

• While £ < r do the following. 

- Applying successive binary merging to Ni to create data matrix Mi. 

- Let H ^ Merge(i7, Mg). 

- Let £ + 1. 



The merging for N = 1000 is shown by Figure [6l 

The complexity of such hierarchy can be analyzed as follows. For successive binary merging with = 2^, 
the computation is p x NL. For A^ ~ X]I=i the computation is bounded by Lj2'i=i ^e^og2{Ni) + 
LJ2'e=i{T^(i+l)Ne — LNi. Therefore, the computation is reduced from the sequential algorithm by a factor 
of T = iN+2)(N 1) Specially, for A^ = 2^, we have T - > ^r^^ 

2[EJ=i iV{log2(JV«)+EJ^i(T-f+l)iVf-iViJ ^ ' 21og2(JV) 21og2(Ar)' 

which is usually a very large number. 



6. Computational Complexity 

In this section, we discuss the computational complexity for the evaluation of ^{.) over uncertainty 
radius interval (j,a]. For practical designs, the robustness requirement is guaranteed for the nominal 
model. Hence, .^(p) = 1 for small p, and we have infpg(Q ^{p) = infpg^a £P{p) for a sufficiently 
large A. A direct Monte Carlo simulation method is to partition the interval (^,a] by m grid points 
J ~ ri < ■ ■ ■ < Tm = a and estimate ^(ri) by A^ i.i.d. Monte Carlo simulations. The estimate of 
infpg( o J^{p) is obtained by taking the minimum of the results for the m grid points. Such direct method 
requires mN simulations. As m gets large, the computing time and the memory complexity becomes a 
challenging problem. Fortunately, by employing our hierarchy sample reuse algorithms, the computational 
complexity is absolutely bounded and very low for arbitrarily dense griding and arbitrarily large dimension 
of uncertainty. 



A STATISTICAL THEORY FOR THE ANALYSIS OF UNCERTAIN SYSTEMS 



11 



For quantifying the computational complexity, we define the equivalent number of grid points, m^q as 
the ratio 

Average total number of simulations 

meq = . 

We shall interpolate the value of ^(r) for r e [r,;,ri+i] as 

^ (r-r-,) ^(r-,+ i) + (r-,+ i -r) ^(r,) 

Ti+l - Ti 



For a uniform gridding, we have 
Theorem 3. Let e G (0, 1) and m = 2 



2(A-1) 



Let r, = f + for i = 1, • • • , m. Then, 



\^{r)- ^*{r)\ <e, Vreh,r 



i+ll 



for i ~ 1, - ■ ■ , m — 1. Moreover, mcq(e) = m — Y^'^L^^ ( 1 — ^ . ) < 1 + In A for any e G (0, 1) 



A-l 



See Appendix C for a proof. For a geometric gridding, we have 



Theorem 4. Let e G (0, 1) and m = 2 + 



Let ri ~ a (i) " ^ /or i = 1, • • • , m. Then, 



|^(r) - ^*(r)| < e, Vr G [r,, r,;+i] 

/or i = 1, • • • , m — 1. Moreover, moq(e) = 1 + (m — 1) — (i) "^^^ j < 1 + In A /or an?/ e G (0, 1). 

See Appendix C for a proof. For completeness, we note that, for arbitrarily large to, the memory 
complexity is also absolutely bounded and independent of uncertainty dimension. 

To compare the computational complexity of our probabilistic measure with that of [3], we recall The- 
orem 6.1 of [3], which states that if 

(6.1) m>l + -^ '— 

e 

then |P(r) — P(ri)| < e Vr G [ri,ri-^i] for i = 1, • • • , to — 1. This bound shows that, for fixed error e, 
the complexity is polynomial. From another perspective, it also shows that the number of grid points and 
computational complexity tend to infinity as the tolerance tends to zero. The computational complexity 
can be reduced by the sample reuse techniques of [5] . It is recently shown in [7] that the equivalent number 
of grid points is bounded by 1 + rflnA (see Appendix C for a proof). In applications, d can be very large. 
For example, the dimension d is 2n^ for a complex block of size n x n. Since the complexity of computing 
^(.) is independent of dimension d, the integral transform can be applied to obtain P(.) from ^(.) and 
thus significantly reduced the computational complexity. 

7. Examples 

In this section, we shall demonstrate the power of our techniques by examples. By the definition of the 
indicator function I(.), for N i.i.d. samples Z\i, • • • , Ajy generated from Br, 



MA) = 



1 if the robustness requirement is satisfied for Ai 
otherwise. 
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Specially, for the robustness stability problem in the M — A setup with M(s) = C{sl — A) ^B, 

1(A) 



1 if A + Bzl.Cis stable; 
otherwise. 



Of course, the N samples are obtained by the HSRA. A minimum variance unbiased estimator of ,?3^(r) is 
taken as ^(r) = ^'=^ . Since I(zii), i = 1, ■ ■ ■ , N are i.i.d. Bernoulli random variables with a success 
probability ^(r), the Chernoff bound [8] asserts that, for any e, (5 e (0, 1), Pr | 

if the sample size N > 

In all examples, we first apply our previous method in [6] to obtain an estimate of the probabilistic 
margin with a risk probability a = 0.05 (Roughly speaking, we are only interested in the curve of robustness 
function above 1 — a — 0.95). Then, we evaluate the robustness function ^(r) for r G [|^, a] by our hierarchy 
sample reuse algorithms. The existing robustness measure is computed from our measure by the integral 
transform. The algorithms are implemented in MATLAB and all programs are executed on a PC of 1024M 
RAM and 3.2G CPU. 

We first consider the case that the uncertainty is of a single block. A typical robustness problem is to 
determine the robustness margin which is specified as the maximum size of uncertainty under the condition 
that all poles of the closed- loop system are restricted in a certain domain Cg. For single blocked uncertainty, 
there exists formulas for computation of the robustness margin in a M—A setup with M (s) = C{sI—A)^^B 
(see, e.g., jl6j for illustration). For complex uncertainty, the robustness margin is 

1 



ml{a{A) : A e C'"''^ and all eigenvalues of .4 + BAC are in Cg} 



sup,eac,^[C(s/-A)-iS] 



where dCg denotes the boundary of domain Cg . This formula was essentially obtained by Doyle and Stein 
[9]. For real uncertainty, the robustness margin is 

TM = M{W{A) : A € W^'P and aU eigenvalues of A + BAC are in Cg} 

1 



sup,,6ac„ inf7e(o.i] (^2 



3?(M) -7 3(A/) 
7-1 3(M) n{M) 



where the function to be minimized is a unimodal function on (0, 1]. This formula was established by Qiu 
and his coworkers [TB] . 

To compare the power of our randomized algorithms with that of these formulas, we revisit two examples 
of [13]. In example 2 of [13], the domain Cg is defined as Cg = {s G C : 3?(s) < 0}. The data of matrices 
A, B, C can be found in page 889 and is thus omitted here. The robustness margins for the complex and 
real uncertainty are obtained, respectively, as rc = 0.3914 and rjj = 0.5141. The robustness functions are 
shown in Figures [7] and [8] for the cases of complex and real uncertainty respectively. It can be seen that our 
randomized algorithms can provide useful information for the system robustness beyond the deterministic 
robustness margin. Specially, the deterministic robustness margin can be estimated from both types of 
robustness functions. Moreover, it can be seen that our robustness measure is significantly less conservative 
than the existing robustness measure. 

In example 3 of [13], the domain Cg is defined as Cg = {s € C : |s| < 1} and the data of matrices 
A, B, C are given in page 889. The robustness margins for the complex and real uncertainty are obtained 
as rc = 0.7472 and tr = 1.0374 respectively. The robustness functions are shown in Figures [51 and [TUl for 
the cases of complex and real uncertainty respectively. 
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■ Existing measure from transform 
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0.3 0.4 0.5 0.6 0.7 

Uncertainty Radius 
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0.9 



Figure 7. Robustness Functions (Sample Size N = 26482). The vertical line marks the 
deterministic robustness margin. 




Figure 8. Robustness Functions (Sample Size N ~ 26482). The vertical Une marks the 
deterministic robustness margin. 
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■ Existing measure from transform 
' Our measure 
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Figure 9. Robustness Functions (Sample Size N = 26482). The vertical line marks the 
deterministic robustness margin. 




Figure 10. Robustness Functions (Sample Size N — 26482). The vertical line marks the 
deterministic robustness margin. 
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Wc now consider the stability margin problem where the uncertainty consists of multiple blocks. A par- 
ticularly important special case is that the uncertainty is real parameters. When the number of uncertainty 
blocks is more than one. the formulas of [9] and [13] are not applicable and the branch and techniques 
are needed. We explore the application of our HSRA for the stability margin problem studied in [lOj by 
a deterministic approach. The system considered in [10] is represented by Figure [TT] The compensator is 
= j+w Pl^^^* i'^* ^('^^ ^ sis+i+o°2S2)ls+6+o.3s-)) "^^i*^^ parametric uncertainty A = [61,62,63]. 



C(s) 



P(s) 



Figure 11. Uncertain System 

The deterministic robustness margin is found to be 3.44 by a branch and bound technique (see, page 
163 of |10|). The robustness functions are shown in Figure [T^ which provides more insight for the system 
robustness than the deterministic robustness margin. 



0.98 



0.96 



CO 

JD 
O 

a. 

H — 

o 

E 
=3 
E 



0.94 



0.92 



0.9 



0.88 



0.86^ 
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Figure 12. Robustness Functions (Sample size N = 119,830.) 

We now consider the robustness problem involving time-domain specifications for the same system shown 
by Figure 1111 The robustness requirement is that the rise time and settling time should be no more than 
0.25 and 3.5 seconds respectively and the overshoot should be no more than 70% under the condition that 
the closed-loop system is stable. It is well-known that this type of problems are, in general, intractable by 
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the deterministic approach. However, our HSRA can readily provided insightful solution. The robustness 
functions are shown in Figure [T31 
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Figure 13. Robustness Functions (Sample size N = 26482.) 

Now we present more extensive numerical experiments for testing the efficiency of our hierarchy sample 
reuse algorithms. We consider the robust stability of a system of transfer function H{s) = C{sl—A)^^ B+D 
with uncertain matrix A = —10 Ikxk + J2e=i 1^ ^ where Ikxk is a fc by k identity matrix, d = 
is the dimension of uncertainty and is a matrix with all elements equal to 1. This is a special case of 
multiple blocks of real uncertainty. Although this may not be a realistic system, it can be representative 
for realistic systems in the respect of computational complexity. 

When the size of matrix A increases from 2 to 10, the dimension of uncertainty increases from 4 to 100. 
The robustness functions for the case that A is of size 10 x 10 is shown in Figure[T31 The computing time is 
shown in Figure [T5] for various problem sizes. The sample size is chosen by the Chernoff bound A*' = ["-^^j 
as 738, 26482, 119830, 3800452 corresponding to e = 6 ^ 0.05, 0.01, 0.005, 0.001 respectively. 

Traditionally, it is widely believed that the classical deterministic robustness analysis are usually more 
efficient than randomized algorithms. However, as can be seen from Figure [T51 our numerical experiments 
indicates that, if one is willing to accept our probabilistic robustness measure, the robustness analysis via 
hierarchy sample reuse algorithms can be generally far more efficient. 

8. Conclusion 

In this paper, we develop a new statistical approach for robustness analysis which requires an extremely 
low complexity that is independent of the dimension of uncertainty space. Our proposed robustness mea- 
sure is less conservative as compared to the existing probabilistic robustness measure. The fundamental 
connection between our measure and the existing one is also established. 
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Figure 14. Robustness Functions (Dimension d = 100. Sample size N = 119,830.) 
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Appendix A. Proof of Theorem 1 
The following Lemma [1] is due to [3] . 

Lemma 1. For any robustness requirement, inf/^g^ Pr{I(.^) = 1 | ||.^|| < 7} =P(7). 

Lemma 2. is a superset of J^, i.e., D ^ . 

Proof. Let j a. G ^ ■ We need to show j a. G Let < ri < r2 be two numbers such that, for any 
A\, A2 satisfying \\A\\\ — r\, \\A2\\ ~ r-i, both j ^(^A\) and //x(/\2) exist. By the radial symmetry of the 
distribution of we can write f.^{Ai) as g{ri) for i = 1, 2. Clearly, the existence implies that is 
continuous at r = r,j, i = 1,2. Let c = dv. By the radial symmetry of the distribution of ^ and the 

scaling property of the function ||.||, we have = linie^o ^ Jr'-e 9^P^ ncp^~^ dp for z = 1, 2, where 

n is the dimension of A. Hence, is continuous at r = r^, i ~ 1,2. Recall that /a. G we have 

(''1) ^ (''2)- On the other hand, by the radial symmetry of the distribution of z\ and the scaling 

property of the function ||.||, we have g{ri) = lim^io ctjr.^'ey^-c(r -e)" * ~ continuity of 
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/ll^ll(r) at r., we have g{n) = ^fi^ for ^ = 1, 2. It follows that ^ = (^)""' > (l^)""' > 

1, implying that e ^. Hence, ^ D ^. 

□ 



Lemma 3. For any S C dB, area(5'r) = ^ area(5') where Sr = {rA : A ^ S} and n is the dimension 
ofB. 

Proof. By the scalable property of 1 1 . 1 1 , 

A 



I^Zi : r - £i < p < r + £2, A e Sr^ = {pA : r - £i < p < r + £2, A € S} = ^A : r - ei < p < r + £2, — ^ ^ 

f ^ < < + ^€S\^^ 

Hence, by invoking the definition (11.11) . arca(5r) = linieiio — — r — — - — ■ Making a change 



of variable q = rq' yields 



area(^.) = ^^'^{^--<^-+-^ ^^4 



= ilo El + S2 

e2i0 

r lim — — - — 

:ii°o {£i+e2)/r 

^„_l Iq'G{pA: -ei<p-l<£2. /^gg} 

^ifo £1+62 

r"-iarea(5'). 

□ 



Lemma 4. Suppose the distribution of ^ is radially symmetrical. Let S be a subset of dB = {A : ||Z\|| = 1}. 



Then, Pr | S S 



= p| = arca{dB) ^'^2/ P > SMc/i that f\\^\\{p) is continuous. 



Proof. By the definition of the conditional probability. 



Pr 



G S 



El 10 



Pr{^e^, p-£i < ll^ll <p + £2} 
Pr{p-ei < ll^ll < ^ + £2} ■ 



We claim that 



G 5, p - £1 < < p + £2} = e 5p,ei,e2} where Sp,e^,e, ={A:feS, p-£i < 



p' < p+£2}. To show this claim, it suffices to show that |z\ : G S*, p — £i < ||Z\|| < p + £2| 



Let Z\ G S'p^ej^gj. By definition, there exists p' G [p — £i,p + £2] such that ^ G 5. Therefore, by the 

= p' G [p - £i,p + £2] and 



we have ||Z\| 



p'4 



scalable property of the function 

^ 7 ^ ^- This implies that Z\ G {z\ : ^ G ^, p - £1 < ||Zi|| < p + £2}. 

Now let A e ^A: e S, p - £1 < ||Z\|| < p + £2| and p' = ||Z\||. By definition, p - £1 < p' < 
p + £2, y £ S. Hence, A G Sp^ei,e2- The claim is thus proved and we have 



Pr 



G5, p-£l < ||Zi|| <p + £2 =Pr{/iG5p,ei,S2} 



Let Sp, = {p'A : Z\ G S*}. Then, S'p/ C dBpi and Sp,^!,^^ = {Z\ : Z\ G S'p', p - £1 < p' < p + £2}. By 
the notion of the radially symmetrical distribution of and the property of the area function shown in 
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LemmaEl we have Pr{^ e Sp, \ \\A\\ = p'} = = p'"-'Z7{aB) = ^^M)- ^he other hand, by 

the definition of the conditional probabihty, 

Pr{A e Sp, \\\A\\= p'} = hm ^ , ^'^^ ^m^.T"'^ 7- 

^ ^ 'ii«Pr{p-ei<\\A\\< p + e2} 



It follows that Pr I 1^ e 



Pr{/\gSp,,,,,,} _ arca(S) 
^2i0 Pr{p-ei<ll^ll<P+e2} ~ arca(a6) ' 



□ 



area(S') 

arca(c?B) 



Lemma 5. Suppose /||^||(.) is continuous in {a,b). Then, Pr|y|^|j (z S \ a < \\A\\ < &| 

Proof. Let 77 > and S G (O,^^). For notational simplicity, let c — By Le 

jO e [a + 5, — (5], we can find e = e{p) such that Pr 1 S \ p — ei < \ \A\ \ < p + 
positive Si, £2 less than e{p). Hence, the union of the open intervals Upe[a+5.b-5]ip~^ip)i P+£(p)) will cover 
interval [a + S,b — S]. By the finite coverage theorem, we can choose finite number of pi from [a + d,b — d] such 
that ^i^iipi — £{pi), Pi+^iPi)) covers interval [a + 6,b— 6] and that none of {pi — e{pi),pi+e{pi)) is nested in 
another. By using the mid-points of the intersections of every two consecutive intervals as dividing points, 
we can partition [a + S,b — 5] as /c intervals [a^, bi] such that Pr 1 G S* | fli < ||/^|| < ^ c < 77 for 



i — 1, • ■ ■ , fc. Therefore, 
for i = 1, - ■ ■ , fc and 

A; 



E 



Pr 



Pr{^ e 5, a, < \\A\\ < 6,} - cPr{a, < ||^|| < b,}\ < ^^Pr{a, < \\A\\ < b,} 



< 77^Pr{a, < \\A\\ < b,} 



e S, a, < ll^ll < b, ^ -cPr{a, < \\A\\ < b,} 



That is, |Pr||^ e S*, a + (5 < < 6 - ^1 - cPr{a + (5 < < 6-(5}| < r;Pr{a + (5 < < 6-(5}. 



As a result, 



Pr{j^^^ eS\a + 6<\\A\\<b-d} 



Pr 



< rj. Since 77 can be arbitrarily small, we have 
€ S, a + S <\\A\ \ <b- s\ = cPT{a + S <\\A\\ <b- S} . 



By the assumption that f\\^\\{.) is piece-wise continuous, we have Pr{p < ||.^|| < p + S} ^ Q as 5 I Q ior 
all p > 0. Hence, 



lim 

= lim 



Pr 



Pr 



1^1 
A 



e 5, a + S< \\A\\ <b-S}-Pr 



e S, a< \\A\\ < a + S 



Pr 



e 5, a < \\A\\ < b 



eS, b~S< \\A\\ < b 



< lim[Pr{a < \\A\\ < a + S} +Pr{b- S < \\A\\ < b}] ^ 0, 

and so lim5xoPr{y|3|| & S, a + S < \\A\\ < 6 - = Pr{y|^ e S, a < \\A\\ < by Similarly, 

lim|Pr{a + (5< \\A\\ < b - 5} - Pr {a < \\A\\ < b}\ 

SIO 

= lim[Pr{a< \\A\\ < a + S} + Py {b - 5 < \\A\\ < b}] =0, 

i5iO 

and so lim^^o PT{a + S < \\A\ \ < b - S} = Pr {a < \\A\\ <b}. It follows that 



Pr 



eS, a< \\A\\ <b}= cPi{a < \\A\\ < b} . 



This completes the proof. 



□ 
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Lemma 6. Suppose that the distribution of ^ is radially symmetrical and that /||^||(.) is piece-wise 
continuous over (0, oo). Then, -[-^jj- is independent with ||.'^||. Moreover, | | ^ | is uniformly distributed 
over {A : ||Z\|| 1}. 

Proof. Since /||^||(.) is piece-wise continuous over (0,00), we can represent (0,oo) as a union of open 
intervals (a^, bi) where is continuous and the set of discrete values pj, j = 1, 2, • • • for which f\\A\\{-) 

is discontinuous. We can enumerate the intervals and the discrete values such that bi — a-i is non- increasing 
with respect to i and that pj — pj-i is non- increasing with respect to j. Then. Pr | £ S, 1 | = j> = 
0, j = 1, 2, • • • and, by LemmaO 



Pr 



4 - ep 



A 

1'^ Tv-TTT e 5", a; < < b. 



e^, 11^11 = P.- 



area(S') 
area(3S) 



^Pr{a. < < 6J + ^Pr{||^|Hp,} 



area(S') 
area(i9K) ' 



Therefore, invoking Lemma |4l we have Pr |yf3u G S \ \\^\ \ = p| = Pr G for any p such that 

f\\A\\{-) is continuous. This implies the independence between yi^yj said \ \^\\. Moreover, since the argument 
holds for any S C {A : \\A\\ = 1}, we have that is uniformly distributed over {A : \\A\\ = 1}. The 
proof is thus completed. 

□ 



Lemma 7. Suppose that 0(.) is continuous over (a, b) and that the distribution of uncertainty A is radially 
symmetrical and continuous over (a, 6). ThenPr{l{A) = l, a < \\A\\ < b} = jy{r)fR{r)dr. 

Proof. Define U — jj^jj, R = ll-^H and fnip) — f^ElMSEil, gy Lemma [6l we have that U and R are 
independent and that U is uniform over dB. Hence, the probability density function of C/i? is ai.ea(3B) ^ fni''') 
and, by the Fubini's Theorem, 



Pr{I{A) ^ I, a < \\A\\ < b} = Pr{I([/i?) = 1, a < i? < 6} 



r=a J{u: i(rti)=i, uedB} area(9S) 



/i?,(r) dudr 



r—a 
b 



1 



{u: i(r«)=i, uedB} area(9S) 
<l>{r)fR{r)dr 



du 



fR{r)dr 



where the last equality follows from the definition of i 



□ 



Lemma 8. Suppose that (p{.) is piece-wise continuous and that f\\^\\{.) is piece-wise continuous and non- 
increasing. Then, ¥v{l{A) = 1, ||Zl|| < 7} Q (t>{p) f\\^\\{p)dp. 
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Proof. Let £ > 0. Since f\\^\\{p) is non-increasing, we have f\\^\\{p) < ^'^H^H-*^^ for p g [e,p]- It 
follows that (j){p) f\\A\\{p) is piece-wise continuous and bounded for p e [£,p\- Hence, the Riemann integral 
Hp) f\\A\\{p)dp exists. Note that (t){p) f\\^\\{p)dp < JJ f\\A\\{p)dp < 1 and that 0(p) f\\A\\{p)dp 
is non-increasing with respect to e. Thus, linieio fj 4'{p) f\\A\\{p)dp exists. This limit is denoted as 
So 4>{p) f\\A\\{p)dp. 

Note that we can partition interval (0,7) as a sequence of intervals {ai,bi), i 1, - • • ,00 such that 
tti, hi, i = 1, 2, • • • arc discontinuities of f\\A\\{p) and that 5, — ai is non-increasing with respect to z. To 
ensure that the partition is unique, we can handle the situation that some intervals have the same length 
by enforcing the following criterion: if 6^ — = bj — aj, i < j then < aj. Then, by the property of 
the Riemann integral, we have 4'{p) f\\A\\{p)dp = X^i^i la' Hp) f\\A\\{p)dp- On the other hand, since 
Pr{||/\|| = a,} Pr{||/l|| = 6J = for i = 1, 2 • • • , cx), we have 

OQ 

Pr{I(Zi) = l, ||Zi|| <7} = Y.^r{l{A) = l, a,<\\A\\<h} 

i=l 

(A.l) = / Hp)hMip)dP 

1=1 
7 

(l>{p) f\\A\\{p)dp 
where the equality (jA.ip follows from Lemma [71 

□ 




Lemma 9. For any r > 0, 0^{r) ^ j, JJ (f){p) dp. 

Proof. By the definition of ^(.), we have ^(r) = Pr{I([/i?) = 1} = Pr{I(;7i?) = 1, \\UR\\ < r} 
where U and R are independent random variables such that U is uniformly distributed over dB and 
R is uniformly distributed over [0,r]. Applying Lemma [5] to random variable A — UR, we have ^(r) = 

Jo Hp) fnip) dp^l S^ Hp) dp. 

□ 



Lemma 10. Let < ri < r2. Then, \.9'{r2) - -^(ri)| < 
Proof. By Lemma [9l 

I^(r2)-^(ri)| = 
< 



2(r2 — ri ) 



/r? Hp) dp 



f2 

m Hp) dp 

r2 



1 1 



r2 ri J Jq 



1 1 



Ti r2 J Jo 



< 1 ri 



T2 



r\r2 



2(r2 -ri) ^ 2(r2-ri) 



r2 



Hp) dp 



Hp) dp 



where we have used the fact that < Hp) — 1- 



□ 



Lemma 11. info<p<T ^(p) = info<p<-y S^{p) where Q denotes the set of all rational numbers. 
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Proof. Let a = info<p<T ^{p) and b = info<p<-y £^{p)- Clearly, a > 5 > 0. Suppose a > b. Then, there 

exists a real number p* G (0,7] such that 3^{p*) < By the dense property of the rational numbers, 

for any S G (0, p*), there exists a number 6 such that ^ £ Q and that \6 — p*\ < S. Thus, by Lemma [TOl 
1^(61) - ^(p*) I < -2^, leading to ^{6) <^{p*) + + Since S can be arbitrarily small, 

we have ^{9) < Hence, a < i.e., a < b, contradicting to a > 6. This shows that a > 5 is not 

true. Therefore, a = b. 

□ 



We are now in the position to prove Theorem 1 . For every (z define /| | a 1 1 (p, 7) ~ ^''^'^^'^^^ ^ ll^li<'r} ^ 



Then, f\\A.\\{p,l) = Pr{||^||<7} ^''^'j^^''-^^ = p/{'|'|"^'||<^} , and the set of all such functions constitute a 
family of conditional density functions, denoted by Clearly, every conditional density f\\^\\{p, 7) in 
is non-increasing with respect to p. For every positive integer k, we use ^-y.fc to denote the set of conditional 



density functions of the form: f\\A\\ip,l) = 'J2i=i I{ri_i.ri]ip), Vp G (0,7] where : 



k ' 



0,1,- 



, k, 



-f(n-i,r.](a;) 



1 if X e (r,_i, n]; 
otherwise 



and ^1 > 6 > 

Pt{I{A) = 1 
Therefore, 
(A.2) 



> & > with ^ ^^ = 1- By LcmmaEl 



l^ll<7} = 



Fr{I{A) = l,\\A\\<j}_ fi\A\\{p)dp 



Pr{||Z^|| <7} 



Pi-{\\A\\<^} 



(t>{p)f\\A\\{p-,l)dp. 



inf Pr{I(/\) 



l|||^ll<7} 



inf 

/||^||(-.7)6^-, JO 



(l>{p)f\\A\\{p.l)dp. 



Since 4){p) I(^ri-i,ri]ip) is bounded and piece-wise continuous over (0,7], it is Riemann integrable. It follows 
that, for a conditional density f\\A\\iPi7) in the family ^j^k, 



Mf\\A\\{p.l)dp = 



dp = y2 / Hp) dp = V 



where = (j){p) (p) dp for i = 1, • • • , fc. Since a; is independent of (^1, • • • , ^/s) for i = 1, • • • , /c, 

we have that 2Zi=i is a linear function of ^i, i ~ 1, ■ ■ ■ ,k for any given fc > 0. Therefore, the infimum 
infjii^ii ( fc Jo 't'{p)f\\^\\{Pjl)dp equals to the minimum of X^iLi subject to the constraint that 

6 > 6 > • ■ ■ > 6 > and ^ Y,,=i S.i = 1- Note that the minimum of a linear program over a bounded 
set is achieved at the extreme points. By Lemma 2.2 of [5], for every extreme point of the convex set 
{(Ci,-- - ,S.k) ■■ ^1 > S.2 > ■ ■ ■ > (.k > 0, ^ELiC* = 1}' we can find an integer £ such that = ^ 
for i = 1, • • • ,^ and = for i = £ + 1, ■ ■ ■ , k. For such extreme point associated with £, we have 
J2i=i — Jo 4>{p}f\\/^\\{Pj l)dp — Jq'^ (j){p)^dp — (£7) , where the last equality follows from Lemma 
H Therefore, 



inf 

/ii/iii(-.7)e.^^.fc 



It follows that 



inf 

/||zil|(-,7)eu~ 



Hp)!\\A\\{pri)dp = min 1^ (^-7j : < ^ < /c 
^ ,/^'^('°)-^II^N('°'^)'^'' = i'^^U {-^ :0<^<fc|=inf|.^(p):0<p<7, ^ ^ q} 
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It can be shown that 

f\\/\\\{-,i)(^^T=i^i.k Jo /iiziii(-,7)e,^T Jo 

Hence, by (|A.2|) . 

mf Pr{I(Zi) = 1 I ll^ll < 7} = mf : < p < 7, - e q1 = inf ^(p), 

/^G^ t 7 J 0<P<7 

where the last equahty follows from Lemma [TlJ Finally, by Lemma [T] and Lemma [5J we have ^^(7) = 
inf/^6^Pr{I(z\) = 1 I \\A\\ < 7} > mf/^e<^ Pr{I(/l) = 1 | < 7} = P(7). The proof is thus 

completed. 

Appendix B. Proof of Theorem 2 

We shall first define some terminologies that will be used in the proof. 

Definition 1. A value of the uncertainty radius is said to be a discontinuity if </>(.) is discontinuous for 
that value. 

Definition 2. An open interval (a, b) is said to be a continuous interval if (f>{r) is continuous for any 
r G (a, 5). 

Definition 3. A discontinuity, p, is said to be a cluster point if, for any e > 0, there exists another 
discontinuity, q, such that |p — < e. 

The proof of the transform formulas is largely focused on the investigation of discontinuities, cluster 
points and continuous intervals. By the assumption that is piece-wise continuous, we can see that the 
distributions of discontinuities and cluster points can be arbitrary. For example, it is possible that there are 
infinitely many discontinuities distributed over (0, r) as where i = 1, • • • ,00 and j = 1, • • • , 00. 

In this example, there are infinitely many cluster points ^^-j-, i ~ 1, - ■ ■ , 00. 

Despite the complexity of the distributions of discontinuities and cluster points, it suffices to prove the 
transform formulas for the following four cases: 

Case (1): There are a finite number of discontinuities. 

Case (2): There are infinitely many discontinuities such that r = is the unique cluster point. 

Case (3): There are infinitely many discontinuities such that there is a cluster point at r = and 
that there is at least one more cluster point at r > 0. 

Case (4): There are infinitely many discontinuities such that there is no cluster point at r = 0. 
Before addressing each case in details, we need to establish some preliminary results. 
The following lemma is on the enumeration and classification of continuous intervals. 

Lemma 12. For any e > 0, the set of all eontinuous intervals defined by the end points q,r or discontinu- 
ities of interval {q,r) can be divided into two classes such that i) the first class, denoted by has a finite 
number of intervals; ii) the second class, denoted by .y^, has infinitely many intervals and the total length 
is less than e. 

Proof. Such classification can be performed as follows. Let k = 1 and Ck = jrr- Find all intervals with 
length greater than Ck- Rank these intervals by the lengths and include it in set Include the remaining 
intervals in set Increment k and update Ck = From ^ find all intervals with length greater than 
Cfc. Add these intervals to set and rank all intervals by the lengths. Eliminate those intervals from set 
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Repeating these steps for infinitely many values of k leads to a sequence of intervals of decreasing lengths. 
Let (oi, bi), 1 = 1,2, ••• denote this sequence, het Li ~ bi — ai. Then, ^.^j^ = (7 and is decreasing 
with respect to i. Thus, by Cauchy's theorem, there must be an integer K such that X^i^iv < ^- This 
implies that we have the desired two classes. The first class consists of intervals (a^, bi), i — 1, - ■ ■ ,K — I 
and the second class consists of intervals (a^, i = A', • • • , 00. □ 

Lemma 13. For any r > 0, P(r) = (f){p) p'^~^ dp where n is the dimension of uncertainty space. 

Proof. Since is uniformly distributed over B, we can derive the density function of as f\\A.''\\{p) = 

^i^. By definition, P(r) = Pr{I(Zi") = 1} = Pr{I(/\") = 1, < r}. By LemmaH 

m = f m.f\\^-\\{p)dp^ f m'^dp = -^ f <p[p)p-'Up. 

Jo Jo f r Jq 

□ 

The following two lemmas establish connections between 4>{.), P(.) and .^(.). 

Lemma 14. For any continuous interval (a, b) with < a < b, 

, , bF(b) - aP(a) n - 1 f'' ^, , , 
</) p dp = ^ + / P p dp. 

Proof. By Lemma [T3l we have P(r) = ^ J^i 't'ip) p""^ dp. Since (f){p) is continuous over (a, 6), we have 

ti[p"»''(p)i 

that P(r) is diflFerentiable with respect to r and that 0(p) = — ^r=T- for any p e (a, 6). Consequently, 



/ 



M dp = dp 



up 
1 



- d[p-¥{p)] 



la np" 

(B.l) = hm ib-e)nb-^:)-ia + ena + e) ^ n_l 

(B.2) = «^i^+!i^rp(p).p 

where we have used the technique of integration by part in (jB.ip and the fact that P(p) is continuous for 
any p > in (jB.2p . □ 

Lemma 15. For any continuous interval (a, b) with < a < b, 

pb 

(l){p) p"-i dp = [6" ^{b) ~ a" ^(a)] - (n - 1) / ^(p) p^'Vp. 

Ja 

Proof. By LemmaO we have ^{p) = ^ cf>{p) dp. Since 0(p) is continuous over (a, 6), we have that 



is differentiable with respect to p and that (/)(p) = '^[p f'lp)! Jqj. g^j-^y ^ ^ (■q^ j,")^ Hence, 



p"-v(p)rfp = rp"-^d[p^(p)] 



= lim [(6 - e)" ^(6 - e) - (a + e)" ^{a + e)] - / p ^(p) (n - l)p""^dp 

n-1. 



[6" ^(6) - a" ^(a)] - (n - 1) / ^(p) p^'Mp 
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where we have used the technique of integration by part and the faet that !!^{p) is continuous for any 
p > 0. 

□ 



Lemma 16. Let q < a < b < r. Then, \b¥{b) - a¥{a)\ < ^ + ^ 
Proof. Note that, for q < a < 6 < r, we have 

\bP{b) - aP{a)\ = \bP{b) - bF{a) + bF{a) - aP{a)\ 

< b\F{b) -F{a)\ + {b- a)F{a) 

n(b — a) , , / nr \ , 

< 6 ^ ^ ' +{b-a)<i— + lj{b-a) 

where we have used the bound \F{b) — P(a)| < , which was derived in the proof of Theorem 6.1 in 

page 856 of [3]. □ 



Lemma 17. Let q < a < b < r. Then, |6" ^(6) - a" ^{a)\ < (^rl + m'"-i) {b - a). 
Proof. Note that, by Lemma [TUl \^{b) - ^{a)\ < 2^^^, we have 

|6" ^(6)-a" ^(a)| = |6" ^(6) - 6" ^(a) + 6" ^(a) - a" ^(a)| 

< 6"|^(5)-^(a)| + (fo"-a")^(a) 

a 

< ^ (6 -a) 

a 

where we have used the inequahty 6" — a" < — a) which can be shown by using Taylor's expansion 

- a) < a" + nb"~^{b - a) with some ^ S (a, 6). 

□ 



We are now in the position to prove the transform formulas for each cases. 
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Case (1): Let = po < Pi < ■ ■ ■ < Pfc < Pk+i = where pi, ■ 
Lemma [HI we have 



,Pk are fc > discontinuities. By 



4>{p)dp 



Um 

40 



(j){p)dp 



lim / <t>{p) dp^^ 



Pi + l 



Km 



(j){p) dp 

PiP(pi) - eP(e) 71-1 rP^ 



np)dp 



E 



p,+iP(p,+i)-p,P(p,) n-1 /-P'+i 



Hm 



-eP(e) ?i. - 1 



np)dp 



np)dp 



rP(r) ?i. - 1 



V{p)dp. 



Since < P(p) < 1, Vp > 0, we have Hmexo elP(e) = and Um.j^o Xf' ]P(/o)c?P = /o^' Hp)dp- It foUows 
that /; cj^{p)dp = + /; P(p)dp and that ^(r) = i </.(p)dp = ?M + /; p(p)d^. 
By Lemma [T5] and similar techniques, we can show the expression for P(r) in this case. 
Case (2): In this case, the discontinuities can be represented as a monotone decreasing sequence 
{pi\°°^i such that r ~ pq > pi > p2 > ■ ■ ■ > Pk > ■ ■ ■ and hm^^ooPfe = 0. By Lemma [Ml we have 



(j)ip)dp = hm V / dp 



Pi-i 



-.1 -Jp. 



hm > 

i.^^ z — / 



Pz-iP(p.-i)-p^Pfe) , n-1 



p,-i 



¥{p)dp 



lim 



rP(r) - PfcP(pfc) n-1 



¥{p)dp 



Since < P(p) < 1, Vp > and limfe_»ooPfe = 0, we have limfe^oo PfeP(pfc) = and Yuau^r^ J^^ ^{p)dp = 
/;P(p)dp. It foUows that /;</)(p)dp = ^ + ii-i/;p(p)dp and i3^(r) = i /J>(p)dp = ^ + 

By Lemma [T5] and similar techniques, we can show the expression for P(r) in this case. 
Case (3): In this case, let be the smallest positive cluster point. Let q = We can write 
Jo ^{p)dp ~ Jq 4'{p)dp + 4>{p)dp. Applying the result of Case (2), we have J^ 4'{p)dp = + 
J^F{p)dp. We consider (j}{p)dp. For any e > 0, by Lemma [T^ we can write 



{B.3) 




Mdp 



where /^^ means the integration over interval (a, b) and X](a b)£'^ m^ans the summation over all 
intervals of .y^- The notion of X](a b)£j' similar. 

To evaluate J2(^a b)ey J{a b) 4'{p)dp, we arrange the intervals in J^g as (a,;, 6^), i = 1, • • • , fc such 
that a\ — q, hi < a^+i, i = 1, • • • , fc — 1 (Here fc is the total number of intervals). Note that, by 
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Lemma I14[ 



(a.b) 



(t>{p)dp = ^ 



i=l 



¥{p)dp 



(B.4) 



rP(r) - q¥{q) n - 1 



n 

fc-l r 



V{p)dp 



E 



a,+iP(a,+i) -6,P(6,) , n-l 



Qi+l 



P(p)dp 



By Lemma [TCI we have \ai+iF{ai+i) - biF{bi)\ < + ij (oj+i - bi), i = 1, - ■ ■ ,k-l and 



a,+iP(a,;+i) - fe,P(6,) 



(B.5) 



(B.6) 



fc-i 

E 



Since < P(p) < 1, we have 



fe-l r 

< E 

i=l 



hi (flj+l - OjJ 



fe-l 



= (7 + i)E(«^^i-M 



= — + 1 £■ 



n- 1 



fc-i 

E 

i=l 

By dEl, jEH, and (lR6l) . 

fc-i 



ai+l 



np)dp 



fc-i 



n — 1 ■r-^ n — 1 

T7 ^ ^ r) 



E 



a,+iP(a,+i)-6,P(fe,) n-l 



P(p)dp 



< (— + 1)£ 



n-l 



n 



(B.7) 



nr n-l 

— + 1 + 

q n 



e. 



Now we bound b)e.^, I(a b) 't'ip)dp- By Lemmas [HI and [TCI 



E 

{a,b)ey. 



Hp)dp = 

< E 



b¥{b) - a¥{a) n - 1 



¥{p)dp 



nr 



(B.8) 



, nr 
= ( — + 1 



q 

n-l 
n 

n-l 



n-l 



+ 1 ib-a) + (6 -a) 



E (^-«) 



n 



Therefore, by (|R3l) . ((R4)) . (|B7l) and pIS)) . 



p(p)dp 



< 



fe-l 
E 



a,+iP(a,+i) - fe,P(6,) n-l 



< 2 



n 

n — 1 
n 



V{p)dp 



E 



(a,6) 



(f>ip)dp 
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Since the above argument holds for arbitrarily small e > 0, it must be true that 't'ip) dp 
+ /J- V{p)dp. It follows that 



(t){p)dp = / (t){p)dp+ / (t){p)dp 

Jo Jq 



n n 



rP(r) 71 - 1 



n n 



n-1 r rP(r)~qF(q) n-1 T™, ^, 
¥{p)dp + + / V{p)dp 



Pip)dp, 



leading to the formula for 3^{r). 

To show the formula for P(r), recall that r"P(r) — n (f>{p) p"~^ dp. We write 

(B.9) r Hp)p''-' dp = r c^ip)p"~' dp + f ^{p)p''-' dp. 

Jo Jo Jq 

By Lemma [T2| we can write 



(B.io) f\{p)p--Up= I mp''-'dp+ I 



b) 



^ip)p--' dp. 



To evaluate J2(ab)e^ I{a t) '^^P^P"^^'^P^ arrange the intervals in J^^ as {ai,bi), i = I,-- - ,fc 
such that ai = q, bi < ai+i, i = 1, - ■ ■ , fc — 1 (Here k is the total number of intervals). Note that, 
by Lemma [TSl 



Y / Hp)p''-'dp = Y 



b'l^ih) ~ ar^(a,) - (n - 1) / :^{p)p''~'dp 



(B.ll) 



r"^i^(r) - (n- 1) / ^{p)p^-^dp 

Jq 

k-l r 

-E 



a;;Vi^(a,;+i) - 6r^(6,) - (n - 1) / ^{p)p"-^dp 

bi 



By LemmaHZl we have |af_^i ^{a,+i) - bf < ( ^ + nr''-^ ) (oi+i - Hence, 



fc-i 



Y[<+l^ia^+l)-b2^{b^)] 



i=l 



2r" 



fc-i 



(B.12) 



2r" 



nr" I e. 



On the other hand, observing that jj^ ^{p) ^dp < r" ^{b — a), we have 
(B.13) J2 np)p"-'dp < - b^) = r"-'e. 
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By (iRTTj) . (IRT2I) and (IrTsI) . 



2r" 



(B.14) 



Now we bound X](a 6)ej?^, /(a 6) 't'ip)p"' ^^P- Lemmas ITSl and ITTl 



E 



(a,6) 



< 



(B.15) 



E 

E 

2r" 
2r" 

Therefore, by (|RTo1) . (|B341) and jBHJ, 



6"^(&)-a"^(a)-(n-l) / ^{p)p^-^dp 

J a 



2r" 
9 



+ nr"-^ \{h-a)-{n- l)r"-^ (& - a) 



r"-i I e. 



< 



j%{p)p--'dp- 










/2r" 







£ = 2 



2r" 
9 



Since the argument applies to arbitrarily small e > 0, it must be true that Jj^0(p)p" ^dp — 
r''^{r) - - (n - 1) /J" ^{p)p"~^dp . Therefore, 





1 

Tl-1 



(n- 1) 
r"^(r) - (n- 1) 







(n-l) np)p''-'dp 

Jq 



from which we find the formula for P(r). 
Case (4): In this case, let be the smallest positive cluster point. Let q = We can write 
/J (j){p)dp = 4>{p)dp + (j){p)dp. Applying the result of Case (1), we have (t>{p)dp = + 
^ j^¥{p)dp. By a method similar to that of Case (3), we have M dp = ^nr)-<in<i) _^ 
J'JP(/ci) dp. Combining the two integrals gives the formula for l^{r). The proof for the 
formula of P(r) is similar. 
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Appendix C. Proofs of Theorem 3 and 4 

For completeness of argument, we need to quote a general complexity result established in [7] as Theorem 
[5] at below. This theorem concerns the sampling complexity of the Sample Reuse Algorithm proposed in 
page 1963 of [5]. 

Theorem 5. Let d be the dimension of uncertainty parameter space. Then, for arbitrary gridding scheme, 
the equivalent number of grid points based on the Sample Reuse Algorithm [S] is strictly bounded from above 
by 1 + d In A, i.e., m^q < 1 + d InA. 

Proof. We first establish the following inequality (|C.ip that will be used to prove Theorem El 



(C.l) 



Inx > 1, Vx > 1. 



To prove [CTl let /(.t) = i + Ina;. Then /(I) = 1 and = ^ > 0, Va; > 1. It follows that 



dx 



fix) > 1, Vx > 1. 

Now we arc in the position to prove Theorem [5l Observing that (^^f) = n"=T^ ' have 



In 



(tt) =i:::T'ln(^) . Therefore, 



m— 1 / \ d 



i=l 



n+1 



ln( ^ 



E 

i=l 



In 



m—1. 



Since > 1, i 1, • • • , m- 1, it follows from ((CT|) that -^^r^^+ln (^^) > 1 for i = 1, 

Hence, (77tr)' + ln (^)' > ^-1, or equivalently, m~j:T=i' (77^7)' < 1 + 1^(17)' = l + ^lnA 

Finally, by Theorem 1 of [5] and the definition of rrieq, we have meq = m — J2iLi^ \r^) ^ 1 + din A 



□ 



C.l. Proof of Theorem H By Lemma [3 \^{r)- ^*{r)\ < llli±i:ill)^ Vr € [rj,r,+i]. Thus, it suffices 



to show ^ ('''+'- ''') < ^ [ q 

(C.2) ^<1 + |- 

By the definition of uniform griding, for i = l,---,rn— 1, 

(m-»-l)(A-l) 



' i+1 



(m-»)(A-i) ^ m - 1 + (A - l)(i - 1) ~ ?n - 1 ■ 

{?n — 1)A 



By virtue of (|C.2p . to guarantee that the gridding error is less than e, it suffices to ensure 1 + < 1 
i.e., m > 1 + Hi^— U.. Hence, it suffices to have m > 2 + ^'-'^ "^"^ . It can be verified that = 1 — -rr- 

' « ' — L ' J ri^i .21 

for i ~ 1, - ■ ■ ,m — 1. 

Let n''' be the total number of simulations on the direction associated with directional sample , k ~ 
1, • • • , TV. Applying Theorem 1 of and Theorem [S] in this paper to a sample reuse process conditioned 
upon a direction with grid points ri, • • • , and sample size = 1, we have E[n'^ | U^] = rn—Y^^^ < 
1 + dlnA and consequently E[n''] = E[E[n'^' | U'']] = m-EllT^ < 1 + din A for fc = 1, • • • , iV. Finally, 
the proof is completed by invoking the definition of equivalent number of grid points. 
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C.2. Proof of Theorem |4l By the definition of uniform griding, wc have = A"-i . Hence, by (|C.2p . 
it suffices to show A™-i < 1 + §, which can be reduced to m > 1 + ^^J"_^, \ . This inequahty is equivalent 



In A 



By letting n be the total number of simulations on the direction associated with 



to m > 2 , , 

ln(H 

directional sample U'', k = I,-- - ,N and applying Theorem 1 of [5] and Theorem [5] in this paper to a 
sample reuse process conditioned upon a direction with grid points ri , • • • , r„i and sample size TV = 1 , 
we have E[n'= | U^] = m - (m - 1) (i)^ < 1 + din A and consequently E[n'=] = E[E[n'= | U'']] = 
m — (m — 1) (j) '""^ < 1 + din A for fc = 1, • • • , iV. The proof is completed by using the definition of 
equivalent number of grid points. 
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